50 research outputs found

Semi-supervised class discovery using quantitative phenotypes – CVD as a case study

Author: Diego Ardigò
Israel Steinfeld
Ivana Zavaroni
Roy Navon
Zohar Yakhini
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Clinically driven semi-supervised class discovery in gene expression data

Author: Diego Ardigò
Israel Steinfeld
Ivana Zavaroni
Roy Navon
Zohar Yakhini
Publication date: 09/08/2008

Motivation: Unsupervised class discovery in gene expression data relies on the statistical signals in the data to exclusively drive the results. It is often the case, however, that one is interested in constraining the search space to respect certain biological prior knowledge while still allowing a flexible search within these boundaries. Results: We develop an approach to semi-supervised class discovery. One component of our approach uses clinical sample information to constrain the search space and guide the class discovery process to yield biologically relevant partitions. A second component consists of using known biological annotation of genes to drive the search, seeking partitions that manifest strong differential expression in specific sets of genes. We develop efficient algorithmics for these tasks, implementing both approaches and combinations thereof. We show that our method is robust enough to detect known clinical parameters in accordance with expected clinical values. We also use our method to elucidate cardiovascular disease (CVD) putative risk factors. Availability: MonoClaD (Monotone Class Discovery). See http://bioinfo.cs.technion.ac.il/people/zohar/MonoClad/ Supplementary information: Supplementary data is available at http://bioinfo.cs.technion.ac.il/people/zohar/MonoClad/software.html Contact: [email protected]

Open Access Repository

Novel Rank-Based Statistical Methods Reveal MicroRNAs with Differential Expression in Multiple Cancer Types

Author: Ben-Dor Amir
Navon Roy
Steinfeld Israel
Tsalenko Anya
Wang Hui
Yakhini Zohar
Publication venue: Public Library of Science
Publication date: 01/11/2009

BACKGROUND: MicroRNAs (miRNAs) regulate target genes at the post-transcriptional level and play important roles in cancer pathogenesis and development. Variation amongst individuals is a significant confounding factor in miRNA (or other) expression studies. The true character of biologically or clinically meaningful differential expression can be obscured by inter-patient variation. In this study we aim to identify miRNAs with consistent differential expression in multiple tumor types using a novel data analysis approach. METHODS: Using microarrays we profiled the expression of more than 700 miRNAs in 28 matched tumor/normal samples from 8 different tumor types (breast, colon, liver, lung, lymphoma, ovary, prostate and testis). This set is unique in putting emphasis on minimizing tissue type and patient related variability using normal and tumor samples from the same patient. We develop scores for comparing miRNA expression in the above matched sample data based on a rigorous characterization of the distribution of order statistics over a discrete state set, including exact p-values. Specifically, we compute a Rank Consistency Score (RCoS) for every miRNA measured in our data. Our methods are also applicable in various other contexts. We compare our methods, as applied to matched samples, to the paired t-test and to the Wilcoxon Signed Rank test. RESULTS: We identify miRNAs that are consistently differentially expressed across the cancer types measured. 41 miRNAs are under-expressed in cancer compared to normal at an FDR (False Discovery Rate) of 0.05, and 17 are over-expressed at the same FDR level. Differentially expressed miRNAs include known oncomiRs (e.g. miR-96) as well as miRNAs that were not previously universally associated with cancer. Specific examples include miR-133b and miR-486-5p, which are consistently downregulated, and miR-629*, which is consistently upregulated in cancer, in the context of our cohort. Data is available in GEO.
Software is available at: http://bioinfo.cs.technion.ac.il/people/zohar/RCoS

Public Library of Science (PLOS)

Directory of Open Access Journals

PubMed Central

EXPANDER – an integrative program suite for microarray data analysis

Author: Elkon Ran
Linhart Chaim
Maron-Katz Adi
Shamir Ron
Sharan Roded
Shiloh Yosef
Steinfeld Israel
Tanay Amos
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Gene expression microarrays are a prominent experimental tool in functional genomics which has opened the opportunity for gaining global, systems-level understanding of transcriptional networks. Experiments that apply this technology typically generate overwhelming volumes of data, unprecedented in biological research. Therefore the task of mining meaningful biological knowledge out of the raw data is a major challenge in bioinformatics. Especially needed are integrative packages that provide biologists with an advanced yet easy-to-use set of algorithms that together cover the whole range of steps in microarray data analysis. RESULTS: Here we present the EXPANDER 2.0 (EXPression ANalyzer and DisplayER) software package. EXPANDER 2.0 is an integrative package for the analysis of gene expression data, designed as a 'one-stop shop' tool that implements various data analysis algorithms ranging from the initial steps of normalization and filtering, through clustering and biclustering, to high-level functional enrichment analysis that points to biological processes that are active in the examined conditions, and to promoter cis-regulatory elements analysis that elucidates transcription factors that control the observed transcriptional response. EXPANDER is available with pre-compiled functional Gene Ontology (GO) and promoter sequence-derived data files for yeast, worm, fly, rat, mouse and human, supporting high-level analysis applied to data obtained from these six organisms. CONCLUSION: EXPANDER's integrated capabilities and its built-in support of multiple organisms make it a very powerful tool for analysis of microarray data. The package is freely available for academic users a

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

GOrilla: a tool for discovery and visualization of enriched GO terms in ranked gene lists

Author: A Subramanian
B Zeeberg
B Zhang
C Backes
Doron Lipson
E Eden
E Gansner
EI Boyle
Eran Eden
F Al-Shahrour
GD Jr
Israel Steinfeld
JJJ Goeman
LJ van't Veer
M Ashburner
P Khatri
Q Xu
QWX Zheng
R Breitling
R Sealfon
Roy Navon
S Maere
TST Beissbarth
Zohar Yakhini
Publication venue: BioMed Central
Publication date: 01/02/2009

Background: Since the inception of the GO annotation project, a variety of tools have been developed that support exploring and searching the GO database. In particular, a variety of tools that perform GO enrichment analysis are currently available. Most of these tools require as input a target set of genes and a background set and seek enrichment in the target set compared to the background set. A few tools also exist that support analyzing ranked lists. The latter typically rely on simulations or on union-bound correction for assigning statistical significance to the results. Results: GOrilla is a web-based application that identifies enriched GO terms in ranked lists of genes, without requiring the user to provide explicit target and background sets. This is particularly useful in many typical cases where genomic data may be naturally represented as a ranked list of genes (e.g. by level of expression or of differential expression). GOrilla employs a flexible threshold statistical approach to discover GO terms that are significantly enriched at the top of a ranked gene list. Building on a complete theoretical characterization of the underlying distribution, called mHG, GOrilla computes an exact p-value for the observed enrichment, taking threshold multiple testing into account without the need for simulations. This enables rigorous statistical analysis of thousands of genes and thousands of GO terms on the order of seconds. The output of the enrichment analysis is visualized as a hierarchical structure, providing a clear view of the relations between enriched GO terms. Conclusion: GOrilla is an efficient GO analysis tool with unique features that make it a useful addition to the existing repertoire of GO enrichment tools. GOrilla's unique features and advantages over other threshold-free enrichment tools include rigorous statistics, fast running time and an effective graphical representation.
GOrilla is publicly available at: http://cbl-gorilla.cs.technion.ac.il
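
The "flexible threshold" idea in the abstract above can be illustrated with a minimal sketch of the minimum hypergeometric (mHG) statistic: for every cutoff n of the ranked list, compute the hypergeometric tail probability of seeing at least the observed number of annotated genes in the top n, then take the minimum over all cutoffs. The function names below are hypothetical and this is not GOrilla's code. The sketch returns only the raw statistic, which is not a p-value, because the exact correction for testing many cutoffs that the abstract describes is not implemented here.

```python
from math import comb


def hypergeom_tail(N, B, n, b):
    """P(X >= b) for X ~ Hypergeometric(N items, B annotated, n drawn)."""
    upper = min(n, B)
    favourable = sum(comb(B, i) * comb(N - B, n - i) for i in range(b, upper + 1))
    return favourable / comb(N, n)


def mhg_statistic(ranked_labels):
    """Return (statistic, cutoff) for a ranked 0/1 list (1 = gene has the GO term).

    The statistic is the smallest hypergeometric tail over all cutoffs n.
    It is NOT a p-value: it ignores the multiple testing over cutoffs.
    """
    N = len(ranked_labels)
    B = sum(ranked_labels)
    best, best_n = 1.0, 0
    b = 0  # annotated genes seen so far in the top n
    for n, label in enumerate(ranked_labels, start=1):
        b += label
        tail = hypergeom_tail(N, B, n, b)
        if tail < best:
            best, best_n = tail, n
    return best, best_n


# Hypothetical ranked list: both annotated genes sit at the very top.
stat, cutoff = mhg_statistic([1, 1, 0, 0, 0, 0])
print(stat, cutoff)
```

Scanning every cutoff is what removes the need for a user-supplied target set; the price is that the minimum must then be corrected before it can be read as a significance level.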

Crossref

Directory of Open Access Journals

PubMed Central

miRNA-mRNA Integrated Analysis Reveals Roles for miRNAs in Primary Breast Tumors

Author: Aure Miriam R.
Borresen-Dale Anne-Lise
Enerly Espen
Johnsen Hilde
Kallioniemi Olli
Kleivi Kristine
Kristensen Vessela N.
Leivonen Suvi-Katri
Makela Rami
Naume Bjorn
Navon Roy
Perala Merja
Rodland Einar
Ronneberg Jo Anders
Russnes Hege G.
Steinfeld Israel
Yakhini Zohar
Publication date: 01/01/2011

Peer reviewed

Directory of Open Access Journals

PubMed Central

VTT Research System

Helsingin yliopiston digitaalinen arkisto

Small Deletion Variants Have Stable Breakpoints Commonly Associated with Alu Elements

Author: A Bacolla
Adam J. de Smith
AFA Smit
AJ de Smith
AJ Iafrate
AJ Sharp
Alexandra I. F. Blakemore
BE Stranger
CY Chan
D Karolchik
DA Hinds
DP Locke
E Eden
E Gonzalez
E Tuzun
EV Linardopoulou
GH Perry
GJ Cost
GM Cooper
Israel Steinfeld
J Sebat
JA Lee
JC Barrett
JO Korbel
K Han
K Lee
KK Wong
Lachlan J. M. Coin
M Dewannieux
M Fanciulli
M Krawczak
Michael Lichten
P Scheet
PA Callinan
Philippe Froguel
PM Kim
R Chenna
R Redon
RD Wells
Rob Sladek
Robin G. Walters
S Gonzalez-Barrera
S Rozen
SA McCarroll
SK Sen
TJ Hubbard
Zohar Yakhini
Publication venue: Public Library of Science
Publication date: 01/01/2008

Copy number variants (CNVs) contribute significantly to human genomic variation, with over 5000 loci reported, covering more than 18% of the euchromatic human genome. Little is known, however, about the origin and stability of variants of different size and complexity. We investigated the breakpoints of 20 small, common deletions, representing a subset of those originally identified by array CGH, using Agilent microarrays, in 50 healthy French Caucasian subjects. By sequencing PCR products amplified using primers designed to span the deleted regions, we determined the exact size and genomic position of the deletions in all affected samples. For each deletion studied, all individuals carrying the deletion share identical upstream and downstream breakpoints at the sequence level, suggesting that the deletion event occurred just once and later became common in the population. This is supported by linkage disequilibrium (LD) analysis, which has revealed that most of the deletions studied are in moderate to strong LD with surrounding SNPs, and have conserved long-range haplotypes. Analysis of the sequences flanking the deletion breakpoints revealed an enrichment of microhomology at the breakpoint junctions. More significantly, we found an enrichment of Alu repeat elements, the overwhelming majority of which intersected deletion breakpoints at their poly-A tails. We found no enrichment of LINE elements or segmental duplications, in contrast to other reports. Sequence analysis revealed enrichment of a conserved motif in the sequences surrounding the deletion breakpoints, although whether this motif has any mechanistic role in the formation of some deletions has yet to be determined. Considered together with existing information on more complex inherited variant regions, and reports of de novo variants associated with autism, these data support the presence of different subgroups of CNV in the genome which may have originated through different mechanisms

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Oxford University Research Archive

University of Melbourne Institutional Repository

Brunel University Research Archive

University of Queensland eSpace

Global Methylation Patterns in Idiopathic Pulmonary Fibrosis

Author: AH Ting
AL Katzenstein
AS Wilson
BC Willis
C Vancheri
CJ Scotton
CL Fattman
DA Schwartz
DC Dolinoy
DJ Weisenberger
E Dudziec
E Li
E Segal
Einat I. Rabinovich
F Mohn
G Howard
G Liu
GJ Faulkner
GP Pfeifer
Guoying Yu
H Sanjo
HR Collard
I Le Jeune
IO Rosas
Israel Steinfeld
JH Kim
JK Choi
JS Han
K Boon
K Chalitchagorn
K Konishi
Kevin F. Gibson
KJ Livak
KK Kim
Kusum V. Pandit
KV Pandit
L Hecker
LJ Vuga
M Ehrich
M Klug
M Pang
M Selman
M Toyota
M Weber
Maria G. Kapetanaki
MM Suzuki
MP Keane
MR Estecio
Naftali Kaminski
O Lababede
Oliver Eickelberg
P Chomczynski
PA Jones
R Jaenisch
R Rajkumar
R Straussman
RA Irizarry
RA Waterland
RK Slotkin
S Yamashita
SK Huang
T Liu
T Rauch
TA Rauch
TJ Gross
V Taskar
VJ Thannickal
VS Taskar
W Huang da
WJ Kent
Y Benjamini
Y Ozawa
YY Sanders
Zohar Yakhini
Publication venue: Public Library of Science
Publication date: 10/04/2012

BACKGROUND: Idiopathic Pulmonary Fibrosis (IPF) is characterized by profound changes in the lung phenotype including excessive extracellular matrix deposition, myofibroblast foci, alveolar epithelial cell hyperplasia and extensive remodeling. The role of epigenetic changes in determining the lung phenotype in IPF is unknown. In this study we determine whether IPF lungs exhibit an altered global methylation profile. METHODOLOGY/PRINCIPAL FINDINGS: Immunoprecipitated methylated DNA from 12 IPF lungs, 10 lung adenocarcinomas and 10 normal histology lungs was hybridized to Agilent human CpG Islands Microarrays and data analysis was performed using BRB-Array Tools and DAVID Bioinformatics Resources software packages. Array results were validated using the EpiTYPER MassARRAY platform for 3 CpG islands. 625 CpG islands were differentially methylated between IPF and control lungs with an estimated False Discovery Rate less than 5%. The genes associated with the differentially methylated CpG islands are involved in regulation of apoptosis, morphogenesis and cellular biosynthetic processes. The expression of three genes (STK17B, STK3 and HIST1H2AH) with hypomethylated promoters was increased in IPF lungs. Comparison of IPF methylation patterns to lung cancer or control samples revealed that IPF lungs display an intermediate methylation profile, partly similar to lung cancer and partly similar to control, with 402 differentially methylated CpG islands overlapping between IPF and cancer. Despite their similarity to cancer, IPF lungs did not exhibit hypomethylation of long interspersed nuclear element 1 (LINE-1) retrotransposon while lung cancer samples did, suggesting that the global hypomethylation observed in cancer was not typical of IPF. CONCLUSIONS/SIGNIFICANCE: Our results provide evidence that epigenetic changes in IPF are widespread and potentially important.
The partial similarity to cancer may signify similar pathogenetic mechanisms while the differences constitute IPF or cancer specific changes. Elucidating the role of these specific changes will potentially allow better understanding of the pathogenesis of IPF.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

D-Scholarship@Pitt

FigShare